ABSTRACT


Given a solution x* and an a priori estimated cost vector c, the inverse optimization problem is to
identify another cost vector d so that x* is optimal with respect to the cost vector d and the deviation of d
from c is minimum. In this paper, we consider the inverse spanning tree problem on an undirected graph
$G = (N, A)$ with n nodes and m arcs, where the deviation between c and d is defined by
the rectilinear distance between the two vectors (that is, the $L_1$ norm). We show that the inverse spanning tree
problem can be formulated as the dual of an assignment problem on a bipartite network $G^0 = (N^0, A^0)$ with
$N^0 = N^1 \cup N^2$ and $A^0 \subseteq N^1 \times N^2$. The bipartite network satisfies the property that
$|N^1| = n - 1$, $|N^2| = m - n + 1$, and $|A^0| = O(nm)$. In general, $|N^1| \ll |N^2|$. Using this special
structure of the assignment problem, we develop a specific implementation of the successive shortest path
algorithm that solves the inverse spanning tree problem in $O(n^3)$ time. We also consider the weighted version
of the inverse spanning tree problem where we minimize the sum of the weighted deviations of arcs and show
that it can be formulated as the dual of the transportation problem. Using a cost scaling algorithm, the
transportation problem can be solved in $O(n^2 m \log(nC))$ time, where C denotes the largest arc cost in the
data. Finally, we consider a minimax version of the inverse spanning tree problem and show that it can be
solved in $O(n^2)$ time.
1.  INTRODUCTION

Let X denote the set of feasible solutions of an optimization problem. Given a solution $x^* \in X$ and
an a priori estimated cost vector c, the inverse optimization problem is to identify another cost vector d so
that $dx^* \le dx$ for all $x \in X$ and such that the deviation of d from c is minimum. Roughly speaking, the
inverse optimization problem is to identify a cost vector d which is nearest to a specified cost vector c and
with respect to which the given solution x* is an optimal solution of the optimization problem. In this
paper, we study inverse minimum spanning tree problems using three different ways to measure deviations
between the cost vectors.

Bitran, Chandru, Sempolinski and Shapiro [1981] studied inverse optimization for the
capacitated plant location problem. To our knowledge, Bitran et al. introduced the concept of inverse
optimization, although it was anticipated by Everett [1963] and by the literature in numerical analysis.
Inverse network optimization problems were first studied by Burton and Toint [1992, 1994]. They studied the
inverse multiple-source shortest path problem with the deviation between two vectors c and d measured by the
$L_2$ norm. They show applications of these problems in traffic modeling and seismic tomography, and suggest
a nonlinear programming algorithm to solve the problem. Inverse minimum cost flow problems with $L_1$, $L_2$
and $L_\infty$ norms have been studied by Sokkalingam [1995]. To the best of our understanding, prior to this
research no one has studied inverse spanning tree problems.

Inverse optimization is an alternative approach to measuring deviation from optimality. Rather
than measuring the distance of a solution x* from optimality by comparing its objective value to the
objective value of the optimum, one poses the following question: How much would one need to perturb the
data so that x* is optimum? This inverse perspective is of special interest when the cost data is subject to
measurement error, which is typically the case. Also, the inverse optimization objective has the
advantage that it is less sensitive to changes in the cost data. Changing a cost coefficient by one unit can
have a major impact on deviation from optimality in the usual sense, but it can only increase the inverse
objective function by 1 unit. We also note that the concept of ε-optimality, which is a critical aspect of the
Goldberg-Tarjan [1987] algorithm for the minimum cost flow problem, is closely related to inverse
optimization. One definition of ε-optimality is the following: a solution x* is called ε-optimal for the
minimum cost flow problem if it is possible to perturb each cost coefficient by at most ε so that x* is optimal
for the perturbed problem.

In this paper, we first study the inverse spanning tree problem in an undirected graph $G = (N, A)$
with n nodes and m arcs, in which the deviation between two arc cost vectors c and d is defined by the
rectilinear distance between the two vectors. We show that the inverse spanning tree problem can be
formulated as the dual of an assignment problem on a bipartite graph $G^0 = (N^0, A^0)$ with
$N^0 = N^1 \cup N^2$ and $A^0 \subseteq N^1 \times N^2$. The bipartite network $G^0$ satisfies the property that
$|N^1| = n - 1$, $|N^2| = m - n + 1$, and $|A^0| = O(nm)$. Often, $n \ll m$; hence $|N^1| \ll |N^2|$. Such an
assignment problem is called unbalanced. We develop a specific implementation of the well known successive
shortest path algorithm for the assignment problem on $G^0$ which obtains a shortest path in an amortized
(that is, average) time of $O(n^2)$. This running time is often much better than $O(|A^0|)$, where $|A^0|$ is
the number of arcs in $G^0$. The resulting algorithm solves the assignment problem, and hence the inverse
spanning tree problem, in $O(n^3)$ time.

Next we consider the weighted version of the inverse spanning tree problem in which the deviation
between two cost vectors c and d is defined by the weighted rectilinear distance between the two vectors.
We show that the weighted inverse spanning tree problem can be formulated as the dual of a transportation
problem on $G^0$ and can be solved by a cost scaling algorithm in $O(n^2 m \log(nC))$ time, where C denotes the
largest arc cost. Finally, we consider the minimax version of the inverse spanning tree problem and show
that it can be solved in $O(n^2)$ time.

This paper is organized as follows. Section 2 shows that the inverse spanning tree problem is the
dual of an unbalanced assignment problem. Section 3 describes an algorithm for the assignment problem and
shows how to obtain an optimal solution of the inverse spanning tree problem from the optimal solution of
the assignment problem. Section 4 considers the weighted inverse spanning tree problem, and Section 5
studies the minimax version of the inverse spanning tree problem.

2.  TRANSFORMATION  TO  THE  ASSIGNMENT  PROBLEM 

In this section, we show that the inverse spanning tree problem can be transformed to an assignment
problem. Here, as elsewhere, we follow the network notation given in the book of Ahuja,
Magnanti and Orlin [1993], and refer the reader to the same. We denote the complement of any set S by
placing a bar on it, that is, by $\bar{S}$.

Let $G = (N, A)$ be an undirected network consisting of the node set N and the arc set A. Let $n = |N|$
and $m = |A|$. We assume that $N = \{1, 2, \ldots, n\}$ and $A = \{a_1, a_2, \ldots, a_m\}$. The data of the
inverse spanning tree problem consists of a spanning tree T of G and an arc cost vector c with $c_j$ denoting
the cost of arc $a_j$. We assume without any loss of generality that $T = \{a_1, a_2, \ldots, a_{n-1}\}$. We
refer to the arcs in T as tree arcs and the arcs in $\{a_n, a_{n+1}, \ldots, a_m\}$ as nontree arcs. The
objective in the inverse spanning tree problem is to find an arc cost vector d such that T is optimal with
respect to d and $\sum_{j=1}^{m} |c_j - d_j|$ is minimum.

In the given spanning tree T, there is a unique tree path between any two nodes; we denote by
$P_j$ the set of indices of the tree arcs on the path in T connecting the two endpoints of arc $a_j$. It is
well known (see, for example, Ahuja, Magnanti and Orlin [1993]) that T is a minimum spanning tree with
respect to the arc cost vector d if and only if

$$d_i \le d_j \quad \text{for each } i \in P_j \text{ and for each } j = n, \ldots, m. \tag{1}$$
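Condition (1) is easy to test mechanically. The sketch below is only an illustration, not part of the original development: it computes the sets $P_j$ by a depth-first search in T and then checks (1). The function names, the dictionary layout for arcs and costs, and the five-arc example instance are all assumptions made for this illustration.

```python
from collections import defaultdict

def tree_path_sets(arcs, tree):
    """Return {j: P_j} for every nontree arc j.

    arcs: dict mapping arc index -> (u, v) endpoints (hypothetical layout).
    tree: set of arc indices forming the spanning tree T.
    """
    adj = defaultdict(list)
    for i in tree:
        u, v = arcs[i]
        adj[u].append((v, i))
        adj[v].append((u, i))

    def path(src, dst):
        # Depth-first search in T, remembering the arc used to reach each node.
        parent = {src: None}
        stack = [src]
        while stack:
            u = stack.pop()
            if u == dst:
                break
            for v, i in adj[u]:
                if v not in parent:
                    parent[v] = (u, i)
                    stack.append(v)
        # Walk back from dst to src, collecting tree arc indices.
        indices, node = set(), dst
        while parent[node] is not None:
            node, i = parent[node]
            indices.add(i)
        return indices

    return {j: path(*arcs[j]) for j in arcs if j not in tree}

def satisfies_condition_1(d, paths):
    """Check d_i <= d_j for each i in P_j and each nontree arc j."""
    return all(d[i] <= d[j] for j, p in paths.items() for i in p)

# Illustrative instance: n = 4 nodes, m = 5 arcs, tree arcs 1..3, nontree arcs 4..5.
arcs = {1: (1, 2), 2: (2, 3), 3: (3, 4), 4: (1, 3), 5: (2, 4)}
paths = tree_path_sets(arcs, tree={1, 2, 3})
c = {1: 4, 2: 7, 3: 5, 4: 3, 5: 6}
print(paths)                            # P_4 and P_5
print(satisfies_condition_1(c, paths))  # T is not optimal for c
```

In this instance arc 4 is cheaper than both tree arcs on its tree path, so T violates (1) under c and some perturbation of the costs is needed.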

The inverse spanning tree problem can alternatively be conceived in the following manner.
Consider the spanning tree T. The tree T may or may not be optimal with respect to the given cost vector c.
If it is, then d = c is the desired cost vector with zero objective function value. If not, then we must perturb
the given cost vector c by α so that T is optimal with respect to (c + α) and $\sum_{j=1}^{m} |\alpha_j|$ is
minimum. This observation allows us to formulate the inverse spanning tree problem as the following
mathematical program:

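$$\text{Minimize } \sum_{j=1}^{m} |\alpha_j| \tag{2a}$$

subject to

$$c_i + \alpha_i \le c_j + \alpha_j \quad \text{for each } i \in P_j \text{ and for each } j = n, \ldots, m, \tag{2b}$$

$$\alpha_j \text{ is unrestricted for each } j = 1, 2, \ldots, m. \tag{2c}$$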

Property 1. There exists an optimal solution α of (2) in which $\alpha_i \le 0$ for each i = 1 to (n-1) and
$\alpha_j \ge 0$ for each j = n to m.

Proof. Observe that if $\alpha_i > 0$ for some i, $1 \le i \le n-1$, then we can set $\alpha_i = 0$ without
violating any conditions in (2b), because a tree arc appears only on the left-hand side of (2b), and without
worsening the objective function value (2a). This establishes the first claim. A similar
argument establishes the second claim. ♦

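Using Property 1, we obtain the following equivalent formulation of the inverse spanning tree
problem:

$$\text{Minimize } \sum_{j=n}^{m} \alpha_j - \sum_{i=1}^{n-1} \alpha_i \tag{3a}$$

subject to

$$c_i + \alpha_i \le c_j + \alpha_j \quad \text{for each } i \in P_j \text{ and for each } j = n, \ldots, m, \tag{3b}$$

$$\alpha_i \le 0 \text{ for each } i = 1 \text{ to } (n-1), \text{ and } \alpha_j \ge 0 \text{ for each } j = n \text{ to } m. \tag{3c}$$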

We now reformulate (3) using the concept of a path graph, which allows us to express the constraints
in (3b) in a manner more suitable for manipulation. The path graph, which we denote by $G^0 = (N^0, A^0)$
with $N^0 = N^1 \cup N^2$, satisfies $N^1 = \{1, 2, \ldots, n-1\}$, $N^2 = \{n, n+1, \ldots, m\}$, and
$A^0 = \{(i, j) : i \in P_j, n \le j \le m\}$. Observe that $A^0$ contains an arc (i, j) for every inequality in
(3b). (We may, however, exclude those arcs (i, j) for which $c_i \le c_j$ because any vector α satisfying (3c)
will automatically satisfy $c_i + \alpha_i \le c_j + \alpha_j$.) We would also like to restate (3) in
maximization form; we can do so by maximizing the negative of (3a). The
modified formulation of the inverse spanning tree problem is as follows:

$$\text{Maximize } \sum_{i \in N^1} \alpha_i - \sum_{j \in N^2} \alpha_j \tag{4a}$$

subject to

$$\alpha_i - \alpha_j \le f_{ij} \quad \text{for each arc } (i, j) \in A^0, \tag{4b}$$

$$\alpha_i \le 0 \text{ for each node } i \in N^1 \text{ and } \alpha_j \ge 0 \text{ for each node } j \in N^2, \tag{4c}$$

where $f_{ij} = c_j - c_i$ for each arc $(i, j) \in A^0$. The formulation (4) is a linear programming problem and
has an associated dual. If we associate the dual variable $x_{ij}$ with the arc (i, j) in (4b), then the dual
of (4) can be stated as follows:

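$$\text{Minimize } \sum_{(i,j) \in A^0} f_{ij} x_{ij} \tag{5a}$$

subject to

$$\sum_{\{j : (i,j) \in A^0\}} x_{ij} \le 1 \quad \text{for each node } i \in N^1, \tag{5b}$$

$$-\sum_{\{i : (i,j) \in A^0\}} x_{ij} \ge -1 \quad \text{for each node } j \in N^2, \tag{5c}$$

$$x_{ij} \ge 0 \quad \text{for each arc } (i, j) \in A^0. \tag{5d}$$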

The formulation (5) is a variant of the well known assignment problem. In the standard assignment
problem, the constraints (5b) and (5c) are in equality form and $|N^1| = |N^2|$. In our variant, it is possible
that $|N^1| \ll |N^2|$. We refer to the formulation (5) as the unbalanced assignment problem.
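To make formulation (5) concrete, the following sketch builds the arc costs $f_{ij} = c_j - c_i$ of the path graph for a small instance, dropping the arcs with $c_i \le c_j$ as permitted above, and solves the unbalanced assignment problem by enumerating 0-1 solutions. The enumeration is a stand-in chosen for brevity and is only sensible for tiny instances; it is not the algorithm developed in Section 3. The function names and the instance are assumptions made for this illustration.

```python
from itertools import combinations

def path_graph_costs(c, paths):
    """Costs f_ij = c_j - c_i on the path graph, keeping only arcs with f_ij < 0."""
    return {(i, j): c[j] - c[i]
            for j, p in paths.items() for i in sorted(p) if c[j] - c[i] < 0}

def brute_force_assignment(f):
    """Minimize (5a) over 0-1 vectors satisfying (5b) and (5c) by enumeration."""
    best_cost, best = 0, ()          # the empty assignment is always feasible
    arcs = list(f)
    for r in range(1, len(arcs) + 1):
        for subset in combinations(arcs, r):
            tails = {i for i, _ in subset}
            heads = {j for _, j in subset}
            # Each node of N^1 and of N^2 may be used at most once.
            if len(tails) == r and len(heads) == r:
                cost = sum(f[a] for a in subset)
                if cost < best_cost:
                    best_cost, best = cost, subset
    return best_cost, best

# Illustrative instance: tree arcs 1..3, nontree arcs 4..5 (P_4 = {1, 2}, P_5 = {2, 3}).
paths = {4: {1, 2}, 5: {2, 3}}
c = {1: 4, 2: 7, 3: 5, 4: 3, 5: 6}
f = path_graph_costs(c, paths)
cost, matching = brute_force_assignment(f)
print(f, cost, matching)
```

For this instance the optimal assignment has cost -4. Since (5) is the dual of (4) and (4a) is the negative of (3a), the minimum total deviation for the instance is 4.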

3.     SOLVING  THE  ASSIGNMENT  PROBLEM 

We have shown in Section 2 that the inverse minimum spanning tree problem can be transformed to
an unbalanced assignment problem on a bipartite network $G^0 = (N^1 \cup N^2, A^0)$, where generally
$|N^1| \ll |N^2|$. In this section, we develop a special purpose algorithm to solve the assignment problem,
using the fact that $|N^1| \ll |N^2|$ to obtain a speedup. The algorithm runs in $O(n^3)$ time. We point out
that for a dense network G (that is, $m = \Omega(n^2)$), the network $G^0$ may contain as many as
$\Omega(n^3)$ arcs. Since any algorithm for the assignment problem must look at each arc at least once, our
$O(n^3)$ algorithm for solving the assignment problem is optimal for some classes of networks G.

Consider any feasible assignment x of (5) which is a 0-1 vector. For this assignment, the nodes
which satisfy the constraints (5b) or (5c) with equality are called matched nodes, and the remaining nodes
are called unmatched nodes. Often $|N^1| < |N^2|$, and most nodes in $N^2$ are not matched in an optimal
assignment. As a matter of fact, in a minimum cost assignment, it is possible that some nodes in $N^1$ are
not matched.

We can transform the unbalanced assignment problem (5) on $G^0 = (N^1 \cup N^2, A^0)$ into a minimum cost
flow problem in the following manner. We introduce a source node s with supply b(s) = n, and add an arc
(s, i) for each $i \in N^1$ with cost $f_{si} = 0$ and capacity $u_{si} = 1$. Similarly, we introduce a sink
node t with b(t) = -n, and add an arc (j, t) for each $j \in N^2$ with cost $f_{jt} = 0$ and capacity
$u_{jt} = 1$. We also add an arc (s, t) with $f_{st} = 0$ and $u_{st} = n + 1$. We set the supply/demand b(i)
of each node $i \in N^0$ to zero. We set the capacity of each arc $(i, j) \in A^0$ to n; the cost of this arc
is $f_{ij}$. Let G' = (N', A') denote the resulting network. Let n' = |N'| and m' = |A'|. Observe that
n' = m + 2 and m' = O(nm). Also observe that, like $G^0$, G' is a bipartite network. We denote by A'(i) the
set of arcs in A' emanating from node i. The following property establishes a connection between the minimum
cost flow problem in G' and the assignment problem in $G^0$.

Property 2. There is a one-to-one correspondence between feasible flows in G' and feasible assignments in
$G^0$, and the costs of the flows and assignments are the same.

Proof. Consider a feasible flow x in G'. Eliminating nodes s and t and the arcs incident on these nodes gives a
solution of the assignment problem in $G^0$ having the same cost. Now consider a feasible assignment x in
$G^0$. For each arc $(i, j) \in A^0$ with $x_{ij} = 1$, we send one unit of flow on arcs (s, i) and (j, t).
Then we send sufficient flow on arc (s, t) to satisfy the supply/demand constraints of nodes s and t. ♦
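The second half of the proof can be sketched in a few lines. The code below is an illustrative reading of that construction, with hypothetical function names: it converts a 0-1 assignment into a flow on G' and computes node imbalances so that the supply/demand constraints can be checked.

```python
def assignment_to_flow(n, matched_arcs):
    """Build the flow on G' that corresponds to a 0-1 assignment.

    matched_arcs: list of arcs (i, j) of the path graph with x_ij = 1.
    Returns a dict mapping arcs of G' to flow values (arcs with zero flow omitted).
    """
    flow = {}
    for i, j in matched_arcs:
        flow[('s', i)] = 1      # one unit enters the matched node i of N^1
        flow[(i, j)] = 1        # and crosses the matched arc
        flow[(j, 't')] = 1      # before leaving through node j of N^2
    # The arc (s, t) absorbs the remaining supply b(s) = n.
    flow[('s', 't')] = n - len(matched_arcs)
    return flow

def net_outflow(flow, node):
    """Flow leaving a node minus flow entering it."""
    out = sum(x for (u, _), x in flow.items() if u == node)
    inn = sum(x for (_, v), x in flow.items() if v == node)
    return out - inn

n = 4
flow = assignment_to_flow(n, [(2, 4)])
print(flow)
print(net_outflow(flow, 's'), net_outflow(flow, 't'), net_outflow(flow, 2))
```

Because at most n - 1 arcs can be matched, the flow on (s, t) is at least 1 and at most n, which respects its capacity of n + 1.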

We can use the well known successive shortest path algorithm to solve the minimum cost flow
problem in G' (see, for example, Ahuja, Magnanti and Orlin [1993] for a detailed description of this
algorithm). The successive shortest path algorithm maintains a primal infeasible flow x and a dual
feasible solution π satisfying the following optimality conditions:

(i)    If $0 < x_{ij} < u_{ij}$, then $f^\pi_{ij} = 0$,  (6a)

(ii)   If $x_{ij} = 0$, then $f^\pi_{ij} \ge 0$,  (6b)

(iii)  If $x_{ij} = u_{ij}$, then $f^\pi_{ij} \le 0$,  (6c)

where $f^\pi_{ij} = f_{ij} - \pi(i) + \pi(j)$.

The successive shortest path algorithm starts with x = 0 and requires a dual solution π which
together with x = 0 satisfies the optimality conditions given in (6). It is easy to verify that
$\pi(j) = 0$ for all $j \in N^2 \cup \{t\}$, $\pi(i) = \min\{f_{ij} : (i, j) \in A^0\}$ for all $i \in N^1$, and
$\pi(s) = \min\{\pi(i) : i \in N^1\}$ is one such solution. The successive shortest path algorithm proceeds by
augmenting unit flow along shortest paths from node s to node t in the residual network G'(x) defined with
respect to the flow x. In general, the shortest paths are computed with respect to the reduced costs
$f^\pi_{ij}$, which are always nonnegative. An iteration of the successive shortest path algorithm consists of
determining shortest path distances d(i) from node s to all other nodes in the residual network G'(x) with
respect to the arc costs $f^\pi_{ij}$, updating node potentials as π = π - d, and augmenting unit flow along
the shortest path from node s to node t. After a number of shortest path augmentations, the arc (s, t) will
become the shortest path from node s to node t, and the algorithm will terminate after augmenting flow on
this arc. (Observe that arc (s, t) will become a shortest path because (i) b(s) = n, (ii) $|N^1| = n - 1$,
and (iii) the capacity of each arc (s, i) is 1, which together imply that any feasible flow will have
positive flow on the arc (s, t).) Let x denote the optimal flow and π denote the optimal dual solution
of the minimum cost flow problem.
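The claim that the initial potentials satisfy (6) together with x = 0 can be checked arc by arc. With x = 0 only condition (6b) applies, so it suffices that every reduced cost is nonnegative. The following worked check is a sketch under two assumptions that are not stated explicitly in the text: the arcs with $c_i \le c_j$ have been dropped from $A^0$, so every remaining $f_{ij}$ is negative, and a node of $N^1$ with no incident arc is given potential 0.

```latex
\begin{align*}
f^{\pi}_{ij} &= f_{ij} - \pi(i) + \pi(j) = f_{ij} - \min_{k} f_{ik} \ge 0
  && \text{for } (i,j) \in A^0,\\
f^{\pi}_{si} &= 0 - \pi(s) + \pi(i) = \pi(i) - \min_{k \in N^1} \pi(k) \ge 0
  && \text{for } i \in N^1,\\
f^{\pi}_{jt} &= 0 - \pi(j) + \pi(t) = 0
  && \text{for } j \in N^2,\\
f^{\pi}_{st} &= 0 - \pi(s) + \pi(t) = -\min_{k \in N^1} \pi(k) \ge 0
  && \text{since } \pi(k) \le 0 \text{ for all } k \in N^1.
\end{align*}
```

Only the last line uses the two assumptions; the first three hold for any arc costs.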

Property 3. The optimal primal solution x together with the optimal dual solution π of the minimum cost
flow problem satisfies the following conditions:

(a)  $\pi(s) = 0$ and $\pi(t) = 0$.

(b)  For each arc (s, i), if $x_{si} = 1$ then $\pi(i) \le 0$; if $x_{si} = 0$ then $\pi(i) \ge 0$.

(c)  For each arc (j, t), if $x_{jt} = 1$ then $\pi(j) \ge 0$; if $x_{jt} = 0$ then $\pi(j) \le 0$.

Proof. We can assume without any loss of generality that $\pi(t) = 0$. Since $0 < x_{st} < u_{st}$, it follows
from (6a) that $f^\pi_{st} = 0$, which implies $\pi(s) = 0$, establishing part (a) of the property. Now
consider part (b). If $x_{si} = 1$, then it follows from (6c) that $f^\pi_{si} \le 0$, which implies
$\pi(i) \le 0$. If $x_{si} = 0$, then it follows from (6b) that $f^\pi_{si} \ge 0$, which implies
$\pi(i) \ge 0$. This establishes part (b). Next consider part (c). If $x_{jt} = 1$, then it follows from (6c)
that $f^\pi_{jt} \le 0$, which implies $\pi(j) \ge 0$. If $x_{jt} = 0$, then it follows from (6b) that
$f^\pi_{jt} \ge 0$, which implies $\pi(j) \le 0$, establishing part (c). ♦

We will now show that the solutions x and π can be used to obtain an optimal solution of the inverse
spanning tree problem. It follows from Property 2 that the flow x has a corresponding assignment. For
simplicity of notation, we view x as denoting the corresponding assignment as well.

Lemma 1. The vector α defined by (7) is an optimal solution of the inverse spanning tree problem:

$$\alpha_i = \begin{cases} \pi(i) & \text{for every matched node } i \text{ in } x, \\ 0 & \text{for every unmatched node } i \text{ in } x. \end{cases} \tag{7}$$


Proof. It follows from Property 3(b) and (c) that α is feasible for (4). Now consider a matched arc (i, j).
Since $0 < x_{ij} < u_{ij}$, it follows from (6a) that $f^\pi_{ij} = 0$, and thus
$f_{ij} - \pi(i) + \pi(j) = 0$. Since both the nodes i and j are matched, using (7) we obtain
$\alpha_i - \alpha_j = f_{ij}$. Summing these equations for all matched arcs and using the facts that
(i) $x_{ij} = 0$ for each unmatched arc (i, j), and (ii) $\alpha_i = 0$ for each unmatched node i yields

$$\sum_{i \in N^1} \alpha_i - \sum_{j \in N^2} \alpha_j = \sum_{(i,j) \in A^0} f_{ij} x_{ij}.$$

Since α is a feasible solution of the primal problem (4), x is a feasible solution of the dual problem (5), and
their objective function values are the same, it follows that α is an optimal solution of the primal problem
(4). This completes the proof of the lemma. ♦


It follows from Lemma 1 that the optimal cost vector d of the inverse spanning tree problem is given
by $d_j = c_j + \alpha_j$ for each j = 1, ..., m.
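The whole pipeline up to this point can be sketched compactly: build G', run successive shortest paths with potentials, normalize so that π(t) = 0, and read off α from (7). The sketch below is an illustrative implementation, not the improved algorithm described later in this section: it uses an ordinary binary-heap Dijkstra from Python's `heapq`, sends all remaining supply once (s, t) is the shortest path, and gives nodes of $N^1$ without incident arcs initial potential 0. The function name, data layout, and instance are assumptions.

```python
import heapq

def inverse_spanning_tree(n, c, paths):
    """Return alpha for tree arcs 1..n-1 and nontree arcs n..m (c: index -> cost)."""
    m = len(c)
    N1, N2 = range(1, n), range(n, m + 1)
    graph = {v: [] for v in ['s', 't', *N1, *N2]}

    def add_arc(u, v, cost, cap):
        # Residual representation: [head, residual capacity, cost, index of mate arc].
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for i in N1:
        add_arc('s', i, 0, 1)
    for j in N2:
        add_arc(j, 't', 0, 1)                  # stored as graph[j][0]
    add_arc('s', 't', 0, n + 1)
    f = {(i, j): c[j] - c[i] for j in N2 for i in paths[j] if c[j] < c[i]}
    for (i, j), cost in f.items():
        add_arc(i, j, cost, n)

    # Initial potentials as in the text (0 for nodes of N^1 without arcs: assumption).
    pi = {v: 0 for v in graph}
    for i in N1:
        pi[i] = min((cost for (a, _), cost in f.items() if a == i), default=0)
    pi['s'] = min((pi[i] for i in N1), default=0)

    remaining = n                               # supply b(s) still to be sent
    while remaining > 0:
        dist, pred, done = {'s': 0}, {}, set()
        heap, tick = [(0, 0, 's')], 1           # tick breaks ties between mixed node types
        while heap:
            du, _, u = heapq.heappop(heap)
            if u in done:
                continue
            done.add(u)
            for k, (v, cap, cost, _) in enumerate(graph[u]):
                nd = du + cost - pi[u] + pi[v]  # reduced cost f_ij - pi(i) + pi(j)
                if cap > 0 and nd < dist.get(v, float('inf')):
                    dist[v], pred[v] = nd, (u, k)
                    heapq.heappush(heap, (nd, tick, v))
                    tick += 1
        far = max(dist.values())                # label used for unreachable nodes
        for v in graph:
            pi[v] -= dist.get(v, far)           # potential update pi = pi - d
        delta, v = remaining, 't'
        while v != 's':                         # bottleneck along the shortest path
            u, k = pred[v]
            delta = min(delta, graph[u][k][1])
            v = u
        v = 't'
        while v != 's':                         # augment
            u, k = pred[v]
            graph[u][k][1] -= delta
            graph[v][graph[u][k][3]][1] += delta
            v = u
        remaining -= delta

    shift = pi['t']                             # normalize so that pi(t) = 0
    alpha = {j: 0 for j in c}
    for v, cap, _, _ in graph['s']:
        if v != 't' and cap == 0:               # x_si = 1: node i is matched
            alpha[v] = pi[v] - shift
    for j in N2:
        if graph[j][0][1] == 0:                 # x_jt = 1: node j is matched
            alpha[j] = pi[j] - shift
    return alpha

paths = {4: {1, 2}, 5: {2, 3}}
c = {1: 4, 2: 7, 3: 5, 4: 3, 5: 6}
alpha = inverse_spanning_tree(4, c, paths)
d = {j: c[j] + alpha[j] for j in c}
print(alpha, d, sum(abs(a) for a in alpha.values()))
```

On this instance the total deviation is 4, and the resulting vector d satisfies condition (1).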

We now analyze the running time of the successive shortest path algorithm. The algorithm solves
at most $|N^1|$ shortest path problems with arc lengths $f^\pi_{ij}$, which are nonnegative. Using the
Fibonacci heap implementation of Dijkstra's algorithm due to Fredman and Tarjan [1984], each shortest path
problem can be solved in $O(m' + n' \log n')$ time. Consequently, this approach can solve the inverse spanning
tree problem in $O(|N^1| (m' + n' \log n')) = O(n^2 m)$ time, because $|N^1| = n - 1$, $n' = m + 2$, and
$m' = O(nm)$.

Improved  Algorithm 

The bottleneck step in our approach is the successive applications of Dijkstra's algorithm, with
each application requiring $O(m' + n' \log n')$ time. We will now describe an improved implementation of
Dijkstra's algorithm for the case when $n^2 < (m' + n' \log n')$. In this case, we show that we can perform all
the applications of Dijkstra's algorithm in $O(n^3)$ total time using data structures simpler than Fibonacci
heaps.

Dijkstra's algorithm, when applied on the residual network G'(x) with arc lengths $f^\pi_{ij}$ to find a
shortest path from node s to node t, maintains a distance label d(i) with each node $i \in N'$. A distance
label d(i) is either finite or infinite; if it is finite, then it denotes the length of some directed path
from node s to node i; otherwise, it implies that a directed path to node i is yet to be discovered. A finite
distance label d(i) is permanent if it is guaranteed to be the shortest path length to node i, and is
temporary otherwise. In each iteration, Dijkstra's algorithm selects a minimum temporary distance label d(i),
makes it permanent, and examines each arc $(i, j) \in A'(i)$ to update
$d(j) = \min\{d(j), d(i) + f^\pi_{ij}\}$. The algorithm terminates when all distance labels become permanent.

We make two changes in Dijkstra's algorithm to improve its worst-case complexity when applied
to G'.

Change 1. Permanently label node t as soon as it has the minimum distance label among temporarily
labeled nodes. Then terminate Dijkstra's algorithm.

Recall  that  the  successive  shortest  path  algorithm  requires  a  shortest  path  from  node  s  to  node  t 
and  such  a  path  becomes  available  as  soon  as  node  t  is  permanently  labeled.  Hence  there  is  no  need  to 
permanently  label  more  nodes. 

Recall from our previous discussion that in every iteration the successive shortest path algorithm
updates the node potentials as $\pi(i) = \pi(i) - d(i)$ for each node i in N', which ensures that the reduced
costs remain nonnegative. In case we terminate Dijkstra's algorithm prematurely, we need to update the node
potentials as $\pi(i) = \pi(i) + (d(t) - d(i))$ for each permanently labeled node i in N'. It follows from
Lemma 2 proved below that updating the node potentials takes O(n) time and, as shown in Ahuja, Magnanti and
Orlin [1993], it also ensures that the arc reduced costs remain nonnegative.
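Change 1 and the accompanying potential update can be sketched as follows. The code is an illustration on a small generic graph rather than on G' itself: `adj` holds the current nonnegative reduced costs, the minimum temporary label is found by a linear scan, and only permanently labeled nodes have their potentials changed. The function names and the example graph are assumptions.

```python
def dijkstra_early_exit(adj, s, t):
    """Dijkstra with Change 1: stop as soon as t is permanently labeled.

    adj[u] is a list of (v, reduced_cost) pairs with reduced_cost >= 0.
    Returns (dist, pred, perm), where perm is the set of permanently labeled nodes.
    """
    dist, pred, perm = {s: 0}, {}, set()
    while True:
        temp = [v for v in dist if v not in perm]
        if not temp:
            break
        u = min(temp, key=dist.get)     # linear scan over temporary labels
        perm.add(u)
        if u == t:
            break                       # Change 1: no need to label more nodes
        for v, w in adj.get(u, []):
            if v not in perm and dist[u] + w < dist.get(v, float('inf')):
                dist[v], pred[v] = dist[u] + w, u
    return dist, pred, perm

def update_potentials(pi, dist, perm, t):
    """pi(i) = pi(i) + (d(t) - d(i)) for each permanently labeled node i."""
    for i in perm:
        pi[i] += dist[t] - dist[i]

adj = {'s': [('a', 1), ('t', 5), ('z', 9)], 'a': [('b', 1)], 'b': [('t', 1)], 'z': [('t', 0)]}
pi = {v: 0 for v in ['s', 'a', 'b', 't', 'z']}
dist, pred, perm = dijkstra_early_exit(adj, 's', 't')
update_potentials(pi, dist, perm, 't')
# Reduced costs after the update: w - pi(u) + pi(v) for every arc (u, v).
print({(u, v): w - pi[u] + pi[v] for u in adj for v, w in adj[u]})
```

In the example node z keeps only a temporary label, its potential is left unchanged, and every arc still has a nonnegative reduced cost, with the arcs on the shortest path from s to t at reduced cost zero.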


An  immediate  byproduct  of  Change  1  is  the  following  lemma. 

Lemma 2. Dijkstra's algorithm permanently labels at most 2n nodes.

Proof. In the residual network G'(x), nodes in $N^1 \cup N^2$ are either matched or unmatched. Since G'(x)
permits at most $|N^1| = n - 1$ units of flow to be sent from node s to node t, there will be at most (n - 1)
matched nodes in $N^1$ and at most (n - 1) matched nodes in $N^2$. Therefore, after at most 2(n - 1) nodes
have been permanently labeled, an unmatched node in $N^2$, say node k, will be permanently labeled.
Immediately thereafter, arc (k, t) is examined, which is the unique arc emanating from node k, and d(t) is
updated to $d(t) = d(k) + f^\pi_{kt} = d(k)$ (because $f^\pi_{kt} = f_{kt} = 0$). Since in Dijkstra's
algorithm the distance labels of permanently labeled nodes are non-decreasing, it follows that node t
acquires the minimum distance label, and in the next iteration Dijkstra's algorithm permanently labels it.
In total, the algorithm permanently labels at most 2n nodes. ♦

We now explain the second change. We partition the arc adjacency list A'(i) of each node $i \in N^1$ in
G'(x) into two parts M(i) and U(i), where $M(i) = \{(i, j) \in A'(i) : \text{node } j \text{ is matched}\}$ and
$U(i) = \{(i, j) \in A'(i) : \text{node } j \text{ is unmatched}\}$. Notice that $|M(i)| < n$, but $|U(i)|$ can
be as big as $|N^2| - |N^1|$, which is O(m). In the second change, Dijkstra's algorithm refrains from
examining all arcs in U(i).

Change 2. In Dijkstra's algorithm, when node i is permanently labeled, examine all arcs in M(i) but
only the smallest cost arc in U(i).

We now show that Change 2 does not affect the correctness of Dijkstra's algorithm. We first
observe that $\pi(j) = 0$ for each unmatched node j in $N^2$. (This is true at initialization, and $\pi(j)$ is
not updated until node j is matched.) So the arc in U(i) with the smallest cost is also the arc in U(i) with
the smallest reduced cost. Suppose that we apply Dijkstra's algorithm without Change 2; that is, we examine
all arcs in U(i). Consider the first time an unmatched node $j \in N^2$ has the minimum temporary distance
label. Clearly, $d(j) = d(i) + f^\pi_{ij}$ for some permanently labeled node $i \in N^1$ such that
$(i, j) \in A'$. We claim that arc (i, j) is the arc in U(i) with minimum reduced cost. For, if this is not
true and (i, k) is the minimum reduced cost arc in U(i), then
$d(k) \le d(i) + f^\pi_{ik} < d(i) + f^\pi_{ij} = d(j)$, contradicting that node j has the minimum temporary
distance label in $N^2$. This argument establishes that if instead of examining all arcs in U(i) we examine
only the least reduced cost arc in U(i), Dijkstra's algorithm will run correctly.

A  byproduct  of  Change  2  is  the  following  lemma. 

Lemma 3. In any iteration, Dijkstra's algorithm will have at most 2n temporary distance labels.

Proof. Each temporary distance label is caused by some permanently labeled node; we say that it was
caused by the node which modified it last. Now notice that each node $i \in N^1$ is either temporarily labeled
or permanently labeled; if it is permanently labeled then it can cause several temporary distance labels of
matched nodes in $N^2$ but the temporary distance label of at most one unmatched node in $N^2$. Since there
are at most (n - 1) matched nodes in $N^2$, there will be at most 2n temporary distance labels in any
iteration. ♦


We can now analyze the worst-case complexity of the successive shortest path algorithm that uses
Dijkstra's algorithm incorporating Change 1 and Change 2. It follows from Lemma 2 that Dijkstra's
algorithm will perform O(n) iterations. It follows from Lemma 3 that in each iteration it can identify in
O(n) time a node with the minimum temporary distance label, say node i. Examining all arcs in M(i) and
the minimum cost arc in U(i) also takes O(n) time plus the time to identify the minimum cost arc in U(i).
Thus each application of Dijkstra's algorithm takes $O(n^2)$ time plus the time needed to identify minimum
cost arcs in U(i). Therefore, the successive shortest path algorithm, which applies Dijkstra's algorithm at
most n times, runs in $O(n^3)$ time plus the time needed to identify minimum cost arcs in U(i).

We now explain how to identify minimum cost arcs in U(i) quickly. Recall that the arc in U(i) with
the minimum reduced cost is also the arc in U(i) with minimum cost. So we maintain the arcs in U(i) for
each node $i \in N^1$ using a binary heap with the cost of the arc (i, j) as its key. These heaps are
constructed once at the beginning, and are updated after each application of Dijkstra's algorithm, so that
they can be reused in the next application. Initially, U(i) = A'(i) and these heaps for all the nodes can be
constructed in O(m') time (see, for example, Cormen, Leiserson and Rivest [1990]). In a binary heap, each
heap operation, such as (i) identifying a minimum cost arc in U(i), or (ii) deleting an arc in U(i), can be
performed in O(log n) time. Suppose that during an application of Dijkstra's algorithm an additional node in
$N^2$, say node k, gets matched. We then need to update some heaps. To do so, we consider each arc
$(i, k) \in A'$ with $i \in N^1$ and remove it from U(i); this takes O(log n) time per arc and a total of
O(n log n) time for node k. The total time taken by the heap operations in all the applications of
Dijkstra's algorithm is $O(m' + n^2 \log n)$. Since m' = O(nm) and $m = O(n^2)$, the heap operations also
take $O(n^3)$ time. We can summarize the discussion in this section by the following theorem.
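The heap bookkeeping for U(i) can be sketched with Python's `heapq`. One deliberate difference from the description above, chosen here for brevity: instead of deleting each arc (i, k) explicitly when node k gets matched, the sketch records k as matched and discards stale entries lazily when they reach the top of a heap. Each arc is still removed at most once, so the total heap work stays within the same bound. The class and method names are hypothetical.

```python
import heapq

class UnmatchedArcHeaps:
    """One binary heap per node i of N^1, keyed by the cost of arc (i, j) in U(i)."""

    def __init__(self, f):
        # f maps arcs (i, j) of the path graph to costs; initially U(i) = A'(i).
        self.heaps = {}
        for (i, j), cost in f.items():
            self.heaps.setdefault(i, []).append((cost, j))
        for heap in self.heaps.values():
            heapq.heapify(heap)          # linear-time construction per heap
        self.matched = set()

    def mark_matched(self, k):
        """Record that node k of N^2 has become matched."""
        self.matched.add(k)

    def min_arc(self, i):
        """Return (i, j, cost) for the cheapest arc in U(i), or None if U(i) is empty."""
        heap = self.heaps.get(i, [])
        while heap and heap[0][1] in self.matched:
            heapq.heappop(heap)          # lazy deletion of arcs whose head got matched
        if not heap:
            return None
        cost, j = heap[0]
        return (i, j, cost)

heaps = UnmatchedArcHeaps({(1, 4): -1, (2, 4): -4, (2, 5): -1})
print(heaps.min_arc(2))
heaps.mark_matched(4)
print(heaps.min_arc(2), heaps.min_arc(1))
```

After node 4 is matched, the arcs (1, 4) and (2, 4) move from U(1) and U(2) to M(1) and M(2), so the cheapest unmatched arc out of node 2 becomes (2, 5) and U(1) is empty.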

Theorem 1. The successive shortest path algorithm can solve the unbalanced assignment problem and,
hence, the inverse spanning tree problem in $O(n^3)$ time.


4.  WEIGHTED  INVERSE  SPANNING  TREE  PROBLEM 

In this section, we consider the weighted inverse spanning tree problem. This problem is a generalization of
the inverse spanning tree problem, where the objective is to minimize the weighted deviation from the
given cost vector. Suppose we associate a nonnegative weight $w_j$ with each arc $a_j \in A$. Then the
weighted inverse spanning tree problem can be formulated as the following mathematical program:

$$\text{Minimize } \sum_{j=n}^{m} w_j \alpha_j - \sum_{i=1}^{n-1} w_i \alpha_i \quad \text{subject to (3b) and (3c).}$$

Using exactly the same method we used for the unit weight case, the weighted inverse spanning
tree problem can be reformulated as

$$\text{Maximize } \sum_{i \in N^1} w_i \alpha_i - \sum_{j \in N^2} w_j \alpha_j \quad \text{subject to (4b) and (4c).}$$
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The  dual  of  this  problem  is  as  follows: 

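$$\text{Minimize } \sum_{(i,j) \in A^0} c_{ij} x_{ij} \tag{8a}$$

subject to

$$\sum_{\{j : (i,j) \in A^0\}} x_{ij} \le w_i \quad \text{for each node } i \in N^1, \tag{8b}$$

$$-\sum_{\{i : (i,j) \in A^0\}} x_{ij} \ge -w_j \quad \text{for each node } j \in N^2, \tag{8c}$$

$$x_{ij} \ge 0 \quad \text{for each arc } (i,j) \in A^0, \tag{8d}$$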

where $c_{ij} = (c_j - c_i)$. The formulation (8) is a variant of the well-known standard transportation problem. In
the standard transportation problem, the constraints (8b) and (8c) are in the equality form and
$\sum_{i \in N^1} w_i = \sum_{j \in N^2} w_j$, neither of which holds in the variant. We refer to the formulation (8) as the
unbalanced transportation problem. We will show that an adaptation of the cost scaling algorithm for bipartite
networks can solve the unbalanced transportation problem in $O(n^2 m \log(nC))$ time, where
$C = \max\{|c_j| : a_j \in A\}$.
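
The duality between the weighted reformulation and (8) can be checked directly with a weak duality argument. The derivation below is a sketch under the assumption that (4b) reads $\alpha_i - \alpha_j \le c_{ij}$ for each arc $(i,j) \in A^0$ and that (4c) reads $\alpha_i \le 0$ for $i \in N^1$ and $\alpha_j \ge 0$ for $j \in N^2$, as the constraints of the unit weight case suggest. For any $\alpha$ feasible for the weighted reformulation and any $x$ feasible for (8):

```latex
\begin{align*}
\sum_{i \in N^1} w_i \alpha_i - \sum_{j \in N^2} w_j \alpha_j
  &\le \sum_{i \in N^1} \alpha_i \sum_{\{j : (i,j) \in A^0\}} x_{ij}
     - \sum_{j \in N^2} \alpha_j \sum_{\{i : (i,j) \in A^0\}} x_{ij}
     && \text{by (8b) with } \alpha_i \le 0 \text{ and (8c) with } \alpha_j \ge 0 \\
  &= \sum_{(i,j) \in A^0} (\alpha_i - \alpha_j)\, x_{ij} \\
  &\le \sum_{(i,j) \in A^0} c_{ij}\, x_{ij}
     && \text{by (4b) and (8d)}
\end{align*}
```

Every feasible value of the maximization problem is therefore bounded above by every feasible value of (8), which is the relationship expected between a linear program and its dual.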

The cost scaling algorithm requires that the mass balance constraints be in the equality form,
which (8) does not satisfy. To satisfy this condition, we first construct the network $G' = (N', A')$ as described
in Section 3 with the modification that the capacity of the arc $(s, i)$ with $i \in N^1$ is set to $w_i$ and the
capacity of the arc $(j, t)$ with $j \in N^2$ is set to $w_j$. We set $b(s) = -b(t) = \min\{\sum_{i \in N^1} w_i, \sum_{j \in N^2} w_j\} + 1$. Also
observe that the unbalanced transportation problem in (8) reduces to a minimum cost flow problem in $G'$.

Goldberg and Tarjan [1987] describe a cost scaling algorithm that can solve the minimum cost flow
problem on the network $G' = (N', A')$ in $O((n')^3 \log(n'C))$ time, where $C$ is the largest magnitude of the arc
costs. Ahuja, Orlin, Stein, and Tarjan [1994] describe an improved implementation of the cost scaling
algorithm for those bipartite networks where one part is considerably smaller than the other part. This
algorithm can solve the minimum cost flow problem in $G'$ in $O((|N^1|^3 + |N^1| m') \log(nC))$ time. Since
$|N^1| = (n - 1)$ and $m' = O(nm)$, the running time of this algorithm becomes $O(n^2 m \log(nC))$.
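
The substitution behind the final bound can be written out as follows. This is a sketch that assumes the bipartite cost scaling bound has the form $O((|N^1|^3 + |N^1| m') \log(nC))$ and uses the fact that $G$ contains a spanning tree, so that $m \ge n - 1$:

```latex
\begin{align*}
|N^1|^3 + |N^1|\, m'
  &= O\bigl(n^3 + n \cdot nm\bigr)
     && \text{since } |N^1| = n - 1 \text{ and } m' = O(nm) \\
  &= O\bigl(n^2 m\bigr)
     && \text{since } m \ge n - 1 \text{ implies } n^3 = O(n^2 m)
\end{align*}
```

Multiplying by the $\log(nC)$ factor gives the stated $O(n^2 m \log(nC))$ running time.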

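Let $x$ denote the optimal primal solution and $\pi$ denote the optimal dual solution of the minimum
cost flow problem. For these solutions, we obtain $\alpha$ in the following manner:

$$\alpha_i = \begin{cases} 0 & \text{if node } i \in N^1 \text{ and arc } (s, i) \text{ is not saturated} \\ \pi(i) & \text{if node } i \in N^1 \text{ and arc } (s, i) \text{ is saturated} \end{cases}$$

$$\alpha_j = \begin{cases} 0 & \text{if node } j \in N^2 \text{ and arc } (j, t) \text{ is not saturated} \\ \pi(j) & \text{if node } j \in N^2 \text{ and arc } (j, t) \text{ is saturated.} \end{cases}$$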

In a manner analogous to the unweighted case, it can be shown that $\alpha$ defined in the above manner is
optimal for the weighted inverse spanning tree problem. We have thus proved the following theorem.

Theorem 2. The cost scaling algorithm for bipartite networks can solve the unbalanced transportation
problem, and hence the weighted inverse spanning tree problem, in $O(n^2 m \log(nC))$ time.


5.  MINIMAX  INVERSE  SPANNING  TREE  PROBLEM 

In this section, we consider the variation of the inverse spanning tree problem where the objective is
to minimize the maximum arc deviation instead of minimizing the sum of the arc deviations. This problem
can be formulated in the following manner:

$$\text{Minimize } \max\{|\alpha_j| : 1 \le j \le m\} \tag{9a}$$

subject to

$$c_i + \alpha_i \le c_j + \alpha_j \quad \text{for each } i \in P_j \text{ and for each } j = n, \ldots, m,$$

or, alternatively,

$$\alpha_j - \alpha_i \ge c_i - c_j \quad \text{for each } i \in P_j \text{ and for each } j = n, \ldots, m, \tag{9b}$$

$$\alpha_i \le 0 \text{ for each } i = 1 \text{ to } (n-1), \text{ and } \alpha_j \ge 0 \text{ for each } j = n \text{ to } m. \tag{9c}$$

Let

$$\delta = \max\{c_i - c_j : i \in P_j,\ j = n, \ldots, m\}. \tag{10}$$

If $\delta \le 0$, then $c$ is an optimal cost vector for $T^*$. Thus, we consider the case when $\delta > 0$. It is easy to see that
$\delta/2$ is a lower bound on the objective function value of the minimax inverse spanning tree problem, because
for some arc pair $\alpha_j - \alpha_i \ge \delta$ and one of them will be at least $\delta/2$ in magnitude. It is also easy to see that
$\alpha_i = -\delta/2$ for each $i$, $1 \le i \le (n-1)$, and $\alpha_j = \delta/2$ for each $j$, $n \le j \le m$, achieves this lower bound and
satisfies every constraint in (9). We have established the following result.

Lemma 4. $\alpha_i = \min\{0, -\delta/2\}$ for each $i$, $1 \le i \le (n-1)$, and $\alpha_j = \max\{0, \delta/2\}$ for each $j$, $n \le j \le m$, with
$\delta$ defined by (10), is an optimal solution of the minimax inverse spanning tree problem.
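
Lemma 4 can be illustrated with a small Python sketch that computes $\delta$ directly from (10) and then sets the deviations as the lemma prescribes. The function names and the input layout are hypothetical: `tree_costs[i]` and `nontree_costs[j]` hold the arc costs, and `paths[j]` lists the indices of the tree arcs in $P_j$. The sketch assumes there is at least one nontree arc.

```python
def minimax_alpha(tree_costs, nontree_costs, paths):
    """Return (delta, alpha_tree, alpha_nontree) following (10) and Lemma 4."""
    # delta = max c_i - c_j over all tree arcs i on the path P_j of nontree arc j.
    delta = max(
        tree_costs[i] - nontree_costs[j]
        for j, path in enumerate(paths)
        for i in path
    )
    # Tree arcs are lowered and nontree arcs are raised by delta/2 (if delta > 0).
    alpha_tree = [min(0.0, -delta / 2)] * len(tree_costs)
    alpha_nontree = [max(0.0, delta / 2)] * len(nontree_costs)
    return delta, alpha_tree, alpha_nontree


def satisfies_9b(tree_costs, nontree_costs, paths, alpha_tree, alpha_nontree):
    """Check c_i + alpha_i <= c_j + alpha_j for every i in P_j."""
    return all(
        tree_costs[i] + alpha_tree[i] <= nontree_costs[j] + alpha_nontree[j]
        for j, path in enumerate(paths)
        for i in path
    )
```

For example, with tree arc costs `[4, 7, 3]` on the path 0-1-2-3 and nontree arcs of cost 5 and 6 whose paths are `[0, 1]` and `[1, 2]`, the sketch gives $\delta = 2$, so every tree arc cost is lowered by 1 and every nontree arc cost is raised by 1.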

We now study the time complexity of the minimax inverse spanning tree problem. The computation
of $\delta$ is the bottleneck operation in the algorithm. If done in a straightforward fashion, it takes $O(mn)$
time. However, we can do it in $O(n^2)$ time as described next. Let $E = (e_{ij})$ denote an $n \times n$ matrix whose $ij$-th
element gives the largest cost of a tree arc in the unique path in $T^*$ connecting nodes $i$ and $j$. We can
determine the $i$-th row of $E$ in $O(n)$ time by performing a search of $T^*$ starting at node $i$. Therefore, the
matrix $E$ can be computed in $O(n^2)$ time. We claim that

$$\delta = \max\{e_{t[j]h[j]} - c_j : n \le j \le m\}, \tag{11}$$

where $t[j]$ and $h[j]$ respectively denote the tail and head nodes of the arc $a_j$. The claim follows from the fact
that for any nontree arc $a_j$ the tree arc $a_i \in P_j$ with the largest value of $c_i$ will attain the largest value of $c_i - c_j$
and $e_{t[j]h[j]}$ gives this largest value of $c_i$. Once the matrix $E$ is available, we can compute $\delta$ in $O(m)$ time using (11). We
have thus established the following theorem.

Theorem 3. The minimax inverse spanning tree problem can be solved in $O(n^2)$ time.
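
The computation of the matrix $E$ and of $\delta$ through (11) can be sketched in Python as follows. The function names and the arc representation, triples `(tail, head, cost)` over nodes `0, ..., n-1`, are assumptions made for illustration. Each row of $E$ is filled by one depth-first search of the tree, as described in the text.

```python
def max_tree_arc_matrix(n, tree_arcs):
    """E[i][j] = largest tree-arc cost on the tree path between i and j."""
    adj = [[] for _ in range(n)]
    for u, v, cost in tree_arcs:
        adj[u].append((v, cost))
        adj[v].append((u, cost))

    # Diagonal entries stay at -inf because the path from i to i has no arcs.
    E = [[float("-inf")] * n for _ in range(n)]
    for root in range(n):
        stack = [(root, -1)]  # (node, parent); one O(n) search per row
        while stack:
            node, parent = stack.pop()
            for nxt, cost in adj[node]:
                if nxt != parent:
                    E[root][nxt] = max(E[root][node], cost)
                    stack.append((nxt, node))
    return E


def delta_from_matrix(E, nontree_arcs):
    """Evaluate (11): one matrix lookup per nontree arc, O(m) in total."""
    return max(E[t][h] - cost for t, h, cost in nontree_arcs)
```

With tree arcs `[(0, 1, 4), (1, 2, 7), (2, 3, 3)]` and nontree arcs `[(0, 2, 5), (1, 3, 6)]`, both nontree arcs have a largest path cost of 7, so (11) evaluates to $\max\{7 - 5, 7 - 6\} = 2$.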